Three Spark Cluster Modes

  • Standalone – a simple cluster manager included with Spark that makes it easy to set up a cluster.
  • Apache Mesos – a general cluster manager that can also run Hadoop MapReduce and service applications.
  • Hadoop YARN – the resource manager in Hadoop 2.
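
Each manager has its own --master URL scheme; as a quick reference, a minimal Scala sketch (the host names and ports below are placeholders, not values from this guide):

import org.apache.spark.SparkConf

// Placeholder endpoints; substitute your own cluster addresses.
val standalone = new SparkConf().setMaster("spark://master-host:7077") // Standalone (default port 7077)
val mesos      = new SparkConf().setMaster("mesos://mesos-host:5050")  // Apache Mesos
val yarn       = new SparkConf().setMaster("yarn")                     // Hadoop YARN; address comes from the Hadoop config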

Standalone Mode:

Start the Master:

./sbin/start-master.sh

Start a Slave:

./sbin/start-slave.sh <master-spark-URL>

  • sbin/start-master.sh - Starts a master instance on the machine the script is executed on.
  • sbin/start-slaves.sh - Starts a slave instance on each machine specified in the conf/slaves file.
  • sbin/start-slave.sh - Starts a slave instance on the machine the script is executed on.
  • sbin/start-all.sh - Starts both a master and a number of slaves as described above.
  • sbin/stop-master.sh - Stops the master that was started via the sbin/start-master.sh script.
  • sbin/stop-slaves.sh - Stops all slave instances on the machines specified in the conf/slaves file.
  • sbin/stop-all.sh - Stops both the master and the slaves as described above.

Connecting an Application to the Cluster:

./bin/spark-shell --master spark://IP:PORT
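
The same URL can also be passed to the SparkContext constructor in application code; a minimal sketch (IP:PORT is the placeholder master address from the command above):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://IP:PORT")   // master URL shown on the master's web UI
  .setAppName("MyStandaloneApp")  // hypothetical application name
val sc = new SparkContext(conf)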

Launching Spark Applications:

The spark-submit script is the most straightforward way to submit a compiled application to the cluster. In standalone cluster deploy mode, a driver launched through spark-submit can be killed with:

./bin/spark-class org.apache.spark.deploy.Client kill <master url> <driver ID>

Resource Scheduling:

The standalone mode currently only supports a simple FIFO scheduler across applications. By default an application grabs all cores in the cluster, so set spark.cores.max to cap how many it requests:

val conf = new SparkConf()
  .setMaster(...)
  .setAppName(...)
  .set("spark.cores.max", "10")
val sc = new SparkContext(conf)
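
Per-executor memory can be capped in the same way through the standard spark.executor.memory property; a minimal sketch, with a placeholder master URL and app name:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://IP:PORT")        // placeholder standalone master URL
  .setAppName("CappedApp")             // hypothetical application name
  .set("spark.cores.max", "10")        // total cores across the cluster
  .set("spark.executor.memory", "2g")  // memory per executor process
val sc = new SparkContext(conf)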

Running Spark on Mesos:

Reference: http://spark.apache.org/docs/latest/running-on-mesos.html

Installing Mesos:

  • Spark 2.0.1 is designed for use with Mesos 0.21.0 or newer. See http://mesos.apache.org/gettingstarted/

Connecting Spark to Mesos:

  • To use Mesos from Spark, you need a Spark binary package available in a place accessible by Mesos, and a Spark driver program configured to connect to Mesos.

Uploading Spark Package:

  • Download a Spark binary package from the Spark download page.
  • Upload it to HDFS, HTTP, or S3. To host on HDFS, use the Hadoop fs put command:

hadoop fs -put spark-2.0.1.tar.gz /path/to/spark-2.0.1.tar.gz

Using a Mesos Master URL:

  • The Master URLs for Mesos are in the form mesos://host:5050 for a single-master Mesos cluster, or mesos://zk://host1:2181,host2:2181,host3:2181/mesos for a multi-master Mesos cluster using ZooKeeper.
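
For the multi-master case, the zk:// form lets Spark discover the current Mesos leader through ZooKeeper; a minimal sketch reusing the placeholder hosts from the URL format above:

import org.apache.spark.SparkConf

// The zk:// master URL points at the ZooKeeper ensemble tracking the Mesos leader.
val haConf = new SparkConf()
  .setMaster("mesos://zk://host1:2181,host2:2181,host3:2181/mesos")
  .setAppName("My app")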

Client Mode:

val conf = new SparkConf()
  .setMaster("mesos://HOST:5050")
  .setAppName("My app")
  .set("spark.executor.uri", "<path to spark-2.0.1.tar.gz uploaded above>")
val sc = new SparkContext(conf)

Or, for spark-shell:

./bin/spark-shell --master mesos://host:5050

Cluster Mode:

Cluster mode on Mesos goes through the MesosClusterDispatcher (started via sbin/start-mesos-dispatcher.sh); the master URL given to spark-submit is the dispatcher's address (port 7077 in the example below), not the Mesos master itself:

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000

Running Spark on YARN:

Cluster mode:

$ ./bin/spark-submit --class path.to.your.Class --master yarn --deploy-mode cluster [options] <app jar> [app options]

$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  --queue thequeue \
  lib/spark-examples*.jar \
  10

Client mode:

$ ./bin/spark-shell --master yarn --deploy-mode client
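
Unlike the standalone and Mesos masters, the ResourceManager address does not appear in the --master value; in YARN mode it is picked up from the Hadoop configuration (HADOOP_CONF_DIR or YARN_CONF_DIR). A minimal sketch, with a placeholder app name:

import org.apache.spark.{SparkConf, SparkContext}

// Master is just "yarn"; Spark locates the ResourceManager through the
// Hadoop config directory exported as HADOOP_CONF_DIR or YARN_CONF_DIR.
val conf = new SparkConf()
  .setMaster("yarn")
  .setAppName("MyYarnApp")
val sc = new SparkContext(conf)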